Compilation of Modelica Array Computations into Single Assignment C for Efficient Execution on CUDA-enabled GPUs

Authors

Kristian Stavåker

Daniel Rolls

Jing Guo

Peter Fritzson

Sven-Bodo Scholz

Abstract

Mathematical models, derived for example from discretisation of partial differential equations, often contain operations over large arrays. In this work we investigate the possibility of compiling array operations from models in the equation-based language Modelica into Single Assignment C (SAC). The SAC compiler, SAC2C, can generate highly efficient code that, for instance, can be executed on CUDA-enabled GPUs. We plan to enhance the open-source Modelica compiler OpenModelica with capabilities to detect data-parallel Modelica for-equations and array equations and compile them into SAC WITH-loops. As a first step, we demonstrate the feasibility of this approach by manually inserting calls to SAC array operations in the code generated from OpenModelica, and we show how capabilities and runtimes can be extended. As a second step, we demonstrate the feasibility of rewriting parts of the OpenModelica simulation runtime system in SAC. Finally, we discuss SAC2C's switchable target architectures and demonstrate one by harnessing a CUDA-enabled GPU to improve runtimes. To the best of our knowledge, compilation of Modelica array operations for execution on CUDA-enabled GPUs is a new research area.
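
As a rough illustration of what "data parallel" means for the array operations described in the abstract, the following Python sketch evaluates the right-hand side of a discretised one-dimensional diffusion equation in two ways: element by element, in the style of a for-equation, and as a single whole-array expression in which every output element depends only on the inputs. The model, the function names `rhs_loop` and `rhs_array`, and the zero boundary values are assumptions chosen for this example; they are not taken from the paper, and the sketch does not show actual Modelica, SAC, or CUDA code.

```python
def rhs_loop(u, c, dx):
    """Element-by-element form, mirroring a for-equation over interior points.

    Assumes len(u) >= 2; boundary derivatives are held at zero (an
    assumption made only for this illustration).
    """
    n = len(u)
    du = [0.0] * n
    for i in range(1, n - 1):
        # Second-order central difference for the interior point i.
        du[i] = c * (u[i - 1] - 2.0 * u[i] + u[i + 1]) / dx**2
    return du


def rhs_array(u, c, dx):
    """Whole-array form of the same computation.

    Each output element is computed from the inputs alone, with no
    dependence on other outputs, so all elements could in principle be
    evaluated in parallel.
    """
    interior = [
        c * (left - 2.0 * mid + right) / dx**2
        for left, mid, right in zip(u, u[1:], u[2:])
    ]
    return [0.0] + interior + [0.0]
```

Both functions return the same values. The second form makes the independence of the elements explicit, which is the property a compiler would need to detect before mapping such a computation to a parallel array construct.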

Related resources

Accelerating high-order WENO schemes using two heterogeneous GPUs

A double-GPU code is developed to accelerate WENO schemes. The test problem is a compressible viscous flow. The convective terms are discretized using third- to ninth-order WENO schemes and the viscous terms are discretized by the standard fourth-order central scheme. The code written in CUDA programming language is developed by modifying a single-GPU code. The OpenMP library is used for parall...

Numerical Simulation of a Lead-Acid Battery Discharge Process using a Developed Framework on Graphic Processing Units

In the present work, a framework is developed for implementation of finite difference schemes on Graphic Processing Units (GPU). The framework is developed using the CUDA language and C++ template meta-programming techniques. The framework is also applicable for other numerical methods which can be represented similar to finite difference schemes such as finite volume methods on structured grid...

Accelerating Quantum Chromodynamics Calculations with GPUs

We present a CUDA C implementation of the Conjugate Gradient (CG) and multi-mass CG solver from the MILC quantum chromodynamics package to speed up improved staggered quarks computations on NVIDIA GPUs. The implementation is built on the QUDA package from Boston University. Keywords: quantum chromodynamics; MILC; GPU

RLT2-based Parallel Algorithms for Solving Large Quadratic Assignment Problems on Graphics Processing Unit Clusters

This paper discusses efficient parallel algorithms for obtaining strong lower bounds and exact solutions for large instances of the Quadratic Assignment Problem (QAP). Our parallel architecture is comprised of both multi-core processors and Compute Unified Device Architecture (CUDA) enabled NVIDIA Graphics Processing Units (GPUs) on the Blue Waters Supercomputing Facility at the University of I...

Publication date: 2010
